Abstract

Modern networks assemble an ever growing number of nodes. However, it remains 
difficult to increase the number of channels per node, thus the maximal degree of the 
network may be bounded. This is typically the case in grid topology networks, where 
each node has at most four neighbors. In this paper, we address the following issue: if 
each node is likely to fail in an unpredictable manner, how can we preserve some global 
reliability guarantees when the number of nodes keeps increasing without bound?

To be more specific, we consider the problem of reliably broadcasting information
on an asynchronous grid in the presence of Byzantine failures - that is, some nodes 
may have an arbitrary and potentially malicious behavior. Our requirement is that 
a constant fraction of correct nodes remain able to achieve reliable communication. 
Existing solutions can only tolerate a fixed number of Byzantine failures if they adopt 
a worst-case placement scheme. Besides, if we assume a constant Byzantine ratio 
(each node has the same probability to be Byzantine), the probability to have a fatal 
placement approaches 1 when the number of nodes increases, and reliability guarantees 
collapse. 

In this paper, we propose the first broadcast protocol that overcomes these diffi- 
culties. First, the number of Byzantine failures that can be tolerated (if they adopt 
the worst-case placement) now increases with the number of nodes. Second, we are 
able to tolerate a constant Byzantine ratio, however large the grid may be. In other 
words, the grid becomes scalable. This result has important security applications in 
ultra-large networks, where each node has a given probability to misbehave. 
Keywords: Byzantine failures, Networks, Broadcast, Fault tolerance, Distributed 
computing, Protocol, Random failures 

1 Introduction 

As modern networks grow larger and larger, their components become more likely to fail. 
Indeed, some nodes can be subject to crashes, attacks, bit flips, etc. Many models of failures 
and attacks have been studied so far, but the most general one is the Byzantine model [TT]:
the failing nodes behave arbitrarily. In other words, we must anticipate the most malicious
strategy they could adopt. This encompasses all other possible types of failures, and has 
important security applications. 

In this paper, we study the problem of reliably broadcasting information in a network 
despite the presence of Byzantine failures. This is a difficult problem, as a single Byzantine 
node, if not neutralized, can potentially lie to the entire network. Our objective is to design 
a broadcast protocol that prevents or limits the diffusion of malicious messages.

Related works. Many Byzantine-robust protocols are based on cryptography [3115]: the 
nodes use digital signatures or certificates. Therefore, the correct nodes can verify the 
validity of received information and authenticate the sender across multiple hops. However,
this approach may not be as general as we want, as the malicious nodes are supposed to ignore 
some cryptographic secrets: therefore, their behavior is not completely arbitrary. Besides, 
cryptographic operations require the presence of a trusted infrastructure that deals with 
public and private keys: if this infrastructure fails, the whole network fails. Yet, we would 
like to consider that any component can fail. For these reasons, we focus on cryptography- 
free solutions. 

Cryptography-free solutions have first been studied in completely connected networks [HI 
HI [121 H31 Ej: a node can directly communicate with any other node, which implies the pres- 
ence of a channel between each pair of nodes. Therefore, these approaches are hardly scalable, 
as the number of channels per node can be physically limited. We thus study solutions in 
multihop networks, where a node must rely on other nodes to broadcast information.

A notable class of algorithms tolerates Byzantine failures with either space [151 HI EI] 
or time [HI [HI El El E] locality. Yet, the emphasis of space local algorithms is on containing 
the fault as close to its source as possible. This is only applicable to the problems where 
the information from remote nodes is unimportant (such as vertex coloring, link coloring 
or dining philosophers). Also, time local algorithms presented so far can tolerate at most one
Byzantine node and are not able to mask the effect of Byzantine actions. Thus, the local 
containment approach is not applicable to reliable broadcast. 

It has been shown that, for agreement in the presence of up to k Byzantine nodes, it is
necessary and sufficient that the network is (2k + 1)-connected, and that the number of nodes
in the system is at least 3k + 1 [I]. Also, this solution assumes that the topology is known
to every node, and that nodes are scheduled according to the synchronous execution model.
Both requirements have been relaxed [19]: the topology is unknown and the scheduling is
asynchronous. Yet, this solution retains 2k + 1 connectivity for reliable broadcast and k + 1
connectivity for detection (the nodes are aware of the presence of a Byzantine failure). In
sparse networks such as a grid (where a node has at most four neighbors), both approaches
can cope only with a single Byzantine node, independently of the size of the grid.

Another existing approach is based, not on connectivity, but on the fraction of Byzantine 
neighbors per node. Broadcast protocols have been proposed for nodes organized on a grid 
[TU] [2]. However, the wireless medium typically induces much more than four neighbors per 
node, otherwise the broadcast does not work. Both approaches are based on a local voting
system, and perform correctly if every node has strictly less than a 1/4 fraction of Byzantine
neighbors. This result was later generalized to other topologies [20], assuming that each
node knows the global topology. Again, in weakly connected networks, this constraint on
the proportion of Byzantine nodes in any neighborhood may be difficult to assess.

All aforementioned results rely on strong connectivity and Byzantine proportion
assumptions in the network. In other words, tolerating more Byzantine failures requires
increasing the number of channels per node, which may be difficult or impossible when the
size of the network increases. To overcome this difficulty, an alternate approach has been
proposed [16]. The idea is to make a small concession to the problem: we now aim at reliable
communication, not between all correct nodes, but between most correct nodes. In other 
words, we now accept that a small minority of correct nodes can be fooled by the Byzantine 
nodes. This is not unrealistic, as we already accepted the idea that some nodes can fail
unpredictably (being hit by Byzantine failures). This approach has been shown to be very
efficient when the Byzantine failures are randomly distributed. This is the case, for instance,
in a peer-to-peer overlay (the malicious nodes do not choose their location when they join the
overlay), or if we consider that each node has a given probability of failure.

All existing approaches have the same weak point: if the number of channels per node 
(degree) is bounded, a fixed number of Byzantine nodes can destabilize the whole network. 
Indeed, if they adopt a sufficiently close formation, they can pretend to be the source node, 
and lie to any other node - thus, we cannot even ensure that most correct nodes communicate 
reliably. Besides, if each node has a given probability to be Byzantine, the probability that 
such a fatal formation exists approaches 1 when the number of nodes increases. Therefore, 
these approaches are hardly scalable when the maximal degree is bounded. 

Our contribution. In this paper, we propose the first broadcast protocol that overcomes 
these difficulties on a specific degree-bounded topology: the grid, where each node has at 
most four neighbors. For this protocol, the diameter of the grid can only have discrete 
values, but can be as large as we want. As in [16], our requirement is that a constant 
fraction of correct nodes achieves reliable communication. We show that the number of 
Byzantine failures that can be tolerated (if they adopt the worst-case placement) increases 
with the number of nodes: in other words, for the first time, this number is not limited by 
the maximal degree or the connectivity of the network. Besides, if we assume a constant rate 
of Byzantine failures (each node has the same probability to be Byzantine), the expected 
reliable fraction of the network is always the same, however large the grid may be. This may 
have applications in large-scale networks, where each node has a given probability to fail: we 
can now increase the size of the network indefinitely, and yet preserve the same reliability 
guarantees. 

The paper is organized as follows. In Section 2, we describe the network topology (a
sequence of grid networks that may be as large as we want) and the broadcast protocol to
execute on it. In Section 3, we adopt the point of view of an omniscient observer that knows
the positions of Byzantine nodes, and give a methodology to determine a reliable node set -
that is, a set of nodes that always communicate reliably, in any possible execution. Finally,
in Section 4, we use the aforementioned methodology to prove the claims.

2 Our algorithm 

In this section, we define a class of grid networks and the broadcast protocol to execute on them.

2.1 Hypotheses 

The network is constituted by a set of processes, called nodes. Some pairs of nodes are linked 
by a communication channel - we call them neighbors - and can exchange messages. Each 
node of the network has a unique identifier, which is its position on the grid. A node, upon 
receiving a message from a neighbor, knows the identifier of this neighbor. The network is 
asynchronous: any message sent is eventually received, but it can be at any time. 

2.2 Network topology 

Let $N = 10$. Our broadcast protocol is defined for the networks $G_k$, $\forall k \geq 1$, $G_k$ being an
$N^k \times N^k$ grid. These networks may be as large as needed.

Definition 1 (Grid network) An $M \times M$ grid is a network such that:

• Each node has a unique identifier $(i, j)$ with $0 \leq i < M$ and $0 \leq j < M$.

• Two nodes $(i_1, j_1)$ and $(i_2, j_2)$ are neighbors if and only if one of these two conditions
is satisfied:

- $i_1 = i_2$ and $|j_1 - j_2| = 1$.

- $j_1 = j_2$ and $|i_1 - i_2| = 1$.

According to our hypotheses, each node knows its identifier $(i, j)$ on the grid, and the
identifiers of its neighbors. Each node of $G_k$ also knows $N$ and $k$.
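
As an illustration of Definition 1, the neighbor relation can be sketched in a few lines of
Python. This is only a minimal sketch: the function names `neighbors` and `grid_side` are
ours and are not part of the protocol.

```python
def neighbors(node, M):
    """Return the neighbors of `node` in an M x M grid (Definition 1)."""
    i, j = node
    # Candidates differ from (i, j) by exactly 1 on exactly one coordinate.
    candidates = [(i, j - 1), (i, j + 1), (i - 1, j), (i + 1, j)]
    # Keep only the positions that lie inside the grid.
    return [(a, b) for (a, b) in candidates if 0 <= a < M and 0 <= b < M]


def grid_side(N, k):
    """Side length of G_k, which is an N^k x N^k grid."""
    return N ** k
```

A corner node has two neighbors, a node on a side has three, and any other node has four,
which is the degree bound used throughout the paper.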

2.3 Informal description of the protocol 

Our broadcast protocol (BP) is defined by induction: we use an existing BP on $G_1$, then use
the BP of $G_k$ to define the BP of $G_{k+1}$. The idea is to associate a cluster of $G_{k+1}$ to each
node of $G_k$. Let $G(p)$ be the cluster associated to a node $p$ (we call it macro-node). This is
illustrated in Figure 1. The goal of a macro-node $G(p)$ is to simulate the behavior of $p$, so
that we obtain a macroscopic BP in $G_{k+1}$. Then, when a node $u$ of $G(p)$ wants to broadcast
a message $m$ in $G_{k+1}$:

1. First, $u$ broadcasts $m$ in $G(p)$ with a local BP.

2. Then, $G(p)$ broadcasts $m$ in $G_{k+1}$ with the macroscopic BP.

The interest of this inductive definition lies in its Byzantine-resilience properties. These
properties are studied in Section 3.

Figure 1: Association of a macro-node of $G_{k+1}$ to each node of $G_k$

2.4 Complete description of the protocol

The BP executed on $G_1$ is the Control Zone Protocol (CZP) proposed in [TO]. Let us give
the methodology to construct the BP of $G_{k+1}$ with the BP of $G_k$. For this purpose, we first
give an algorithm to communicate between two macro-nodes (macro-channel), then use it
to construct the macroscopic BP.

Macro-node. To each node $p$ of $G_k$, we associate a cluster $G(p)$ of $G_{k+1}$, called
macro-node. Let $(i, j)$ be the identifier of $p$. Then, $G(p)$ is the $N \times N$ grid such that the
node $(0, 0)$ of $G(p)$ corresponds to the node $(Ni, Nj)$ of $G_{k+1}$.
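
The correspondence between a node of the larger grid and its macro-node can be sketched as
follows. This is a minimal illustration under the convention stated above (the local node
(0, 0) of G(p) is the node (Ni, Nj) of the larger grid); the helper names are hypothetical.

```python
def macro_node_of(u, N):
    """Identifier (i, j) of the node p such that u belongs to the macro-node G(p)."""
    x, y = u
    return (x // N, y // N)


def local_position(u, N):
    """Position of u inside its macro-node, seen as an N x N grid."""
    x, y = u
    return (x % N, y % N)


def global_position(p, local, N):
    """Inverse mapping: local node (a, b) of G(p) back to the larger grid."""
    (i, j), (a, b) = p, local
    return (N * i + a, N * j + b)
```

For instance, with N = 10, the node (23, 47) belongs to the macro-node of p = (2, 4) and
sits at local position (3, 7).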

Macro-channel. Let $p$ and $q$ be two neighbor nodes in $G_k$. We give an algorithm to
transfer messages from $G(p)$ to $G(q)$, as if they were two neighbor nodes linked by a channel.

First, we execute the CZP on both $G(p)$ and $G(q)$, to enable local broadcast inside each
macro-node. The following algorithm sends a message $m$, known by the nodes of
$G(p)$, to the nodes of $G(q)$. Let $Border(p)$ (resp. $Border(q)$) be the set of nodes of $G(p)$
(resp. $G(q)$) having a neighbor in $G(q)$ (resp. $G(p)$).

1. The nodes of $Border(p)$ send $m$ to their neighbor in $Border(q)$.

2. The nodes of $Border(q)$, upon receiving $m$ from their neighbor in $Border(p)$, broadcast
$m$ in $G(q)$ with the CZP.

3. The nodes of $G(q)$, upon receiving strictly more than $N/2$ distinct messages $(v_j, m)$
through the CZP with $v_j \in Border(q)$, accept $m$.

Figure 2: Principle of the protocol

We associate a dynamic set $Sen_q$ to each node of $G(p)$ (storing the messages to send), and
a dynamic set $Rec_p$ to each node of $G(q)$ (storing the messages received). We execute this
algorithm for each pair of neighbor macro-nodes. This mechanism is illustrated in Figure 2.
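
The acceptance rule of step 3 can be sketched as follows, from the point of view of a single
node of G(q). This is only an illustration: the class and method names are ours, the CZP
itself is not modeled (we simply assume it delivers pairs (v, m) to the node), and the set
`rec_p` plays the role of the dynamic set of received messages described above.

```python
class MacroChannelReceiver:
    """State kept by one node of G(q) for the macro-channel coming from G(p)."""

    def __init__(self, N, border_q):
        self.N = N
        self.border_q = set(border_q)  # nodes of G(q) having a neighbor in G(p)
        self.votes = {}                # message -> border nodes that relayed it
        self.rec_p = set()             # messages accepted from G(p)

    def on_czp_delivery(self, v, m):
        """Handle a pair (v, m) delivered by the local CZP; return True once m is accepted."""
        if v in self.border_q:
            # Votes are stored in a set, so each border node counts at most once.
            self.votes.setdefault(m, set()).add(v)
            # Accept when strictly more than N/2 distinct border nodes relayed m.
            if len(self.votes[m]) > self.N / 2:
                self.rec_p.add(m)
        return m in self.rec_p
```

With N = 10, a message is accepted only once six distinct nodes of the border have relayed
it. The state only grows (votes and accepted messages are never removed), so the order in
which the pairs are delivered does not matter.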

Macroscopic BP. For each node $p$ of $G_k$, all nodes of $G(p)$ execute the same algorithm
as $p$, with the two following modifications:

1. When the algorithm requires sending a message $m$ to a neighbor $q$, add $m$ to $Sen_q$.

2. When a message $m$ is added to the set $Rec_q$, consider that $m$ was received from $q$.

Now, let $s$ be a node of $G(p)$ that wants to broadcast a message $m$ in $G_{k+1}$. First, $s$
broadcasts $(s, m)$ in $G(p)$ with the CZP. Then, upon receiving $(s, m)$, the nodes of $G(p)$
broadcast $(s, m)$ with the macroscopic BP. Thus, the nodes receiving $(s, m)$ know that $s$
broadcast $m$: we now have a BP on $G_{k+1}$.

3 Construction of a reliable node set 

In this section, we now assume that some nodes are Byzantine, and behave arbitrarily instead 
of following the aforementioned protocol. We adopt the point of view of an omniscient 
external observer, knowing the positions of Byzantine nodes, and give a methodology to 
determine a reliable node set - that is, a set of nodes that communicate reliably in any 
possible execution. This methodology is used in Section 4 to prove the claims. Notice that
we never require that a node determines such a set: this is just a global view of the system.

Notion of reliable node set. The nodes following the aforementioned protocol are called
correct. The correct nodes do not know the positions of Byzantine nodes.

Definition 2 (Reliable node set) For a given broadcast protocol (BP), a set of correct 
nodes is reliable if, for each pair of nodes s and r of this set: 

1. If s broadcasts m, r eventually accepts (s,m). 

2. If r accepts (s,m), s necessarily broadcast m.

In other words, a reliable node set behaves like a network without Byzantine failures. The 
item (1) guarantees that the nodes always manage to communicate. The item (2) guarantees 
that no node of the reliable set can be fooled - for instance, if a Byzantine node broadcasts 
(s, m') to make the network believe that s broadcast m'.

Construction of a reliable node set. Let $Corr$ be a set of correct nodes of $G_k$. Let us
define a function $Rel_k$ such that $Rel_k(Corr)$ returns a reliable node set for our BP. For this
purpose, we first introduce some new elements.

In [TB], we gave a methodology to determine a reliable node set for the CZP on an
$N \times N$ grid, for a given set $Corr$ of correct nodes. Let $Rel_{CZP}$ be a function such that
$Rel_{CZP}(Corr)$ returns a reliable node set for the CZP.

Finally, we introduce the notion of correct macro-node. In broad outline, a correct
macro-node behaves like a correct node in the macroscopic BP. This intuitive idea is the key
element of the next theorem.

Definition 3 (Correct macro-node) Let there be an $N \times N$ grid with a distribution $Corr_0$
of correct nodes. This grid (or macro-node) is said to be correct if each side of the grid (up,
down, right and left), among its $N$ nodes, has strictly more than $3N/4$ nodes in $Rel_{CZP}(Corr_0)$.
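
Definition 3 can be checked mechanically once the reliable node set of a macro-node is
known. The sketch below is only an illustration: the reliable set is taken as an input (we do
not recompute it from the CZP), the orientation of the sides is our own convention, and
`reliable_channels` assumes that G(q) sits immediately to the right of G(p).

```python
def sides(N):
    """The four sides of an N x N grid, as lists of (i, j) positions."""
    last = N - 1
    return {
        "up": [(0, j) for j in range(N)],
        "down": [(last, j) for j in range(N)],
        "left": [(i, 0) for i in range(N)],
        "right": [(i, last) for i in range(N)],
    }


def is_correct_macro_node(reliable, N):
    """Definition 3: each side has strictly more than 3N/4 nodes in the reliable set."""
    reliable = set(reliable)
    return all(
        sum(1 for v in side if v in reliable) > 3 * N / 4
        for side in sides(N).values()
    )


def reliable_channels(rel_p, rel_q, N):
    """Number of channels linking a reliable node of G(p) to a reliable node of G(q)."""
    rel_p, rel_q = set(rel_p), set(rel_q)
    # Row i of the right side of G(p) faces row i of the left side of G(q).
    return sum(1 for i in range(N) if (i, N - 1) in rel_p and (i, 0) in rel_q)
```

With N = 10, each side needs at least 8 reliable nodes. Even if the missing nodes of two
facing sides are at different rows, at least 6 rows remain where both endpoints are reliable,
which is strictly more than N/2.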

The underlying idea of this definition is the following: the reliable node sets of two
adjacent correct macro-nodes are always connected by a majority of channels (strictly more
than $N/2$). Indeed, fewer than $N/4$ nodes are missing from the reliable set on each of the two
facing sides, so strictly more than $N/2$ of the $N$ channels link two reliable nodes. Therefore,
the messages exchanged between these two reliable sets always receive a majority of votes.
This idea is illustrated in Figure 3 and used in the proof below.

Figure 3: Reliable communication between 2 correct macro-nodes



We can now define the function $Rel_k$ by induction, $\forall k \geq 1$:

• $Rel_1 = Rel_{CZP}$

• $Rel_{k+1}(Corr) = \bigcup_{p \in Rel_k(Corr')} Rel_{CZP}(Corr(p))$, where:

- $Corr$ is a distribution of correct nodes on $G_{k+1}$.

- $Corr(p)$ is the corresponding distribution on $G(p)$.

- $Corr'$ is the set of nodes $p$ of $G_k$ such that $G(p)$ is a correct macro-node.

In the following, we refer to $Rel_{CZP}(Corr(x))$ by $Rel(x)$.
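
The inductive definition above translates directly into a recursive function. The sketch
below is only an illustration under explicit assumptions: the function computing the
reliable set of the CZP on an N x N grid is not reproduced here, so it is passed as a
parameter `rel_czp(corr, N)` (a hypothetical signature), and the orientation of the sides in
`is_correct` is our own convention. The recursion goes from rank k to rank k - 1, which is the
same definition with shifted indices.

```python
def is_correct(reliable, N):
    """Definition 3: every side has strictly more than 3N/4 reliable nodes."""
    last = N - 1
    side_list = [
        [(0, j) for j in range(N)], [(last, j) for j in range(N)],
        [(i, 0) for i in range(N)], [(i, last) for i in range(N)],
    ]
    return all(sum(v in reliable for v in side) > 3 * N / 4 for side in side_list)


def rel(k, corr, N, rel_czp):
    """Reliable node set Rel_k(corr) on the N^k x N^k grid.

    corr    -- set of (x, y) positions of the correct nodes
    rel_czp -- caller-supplied function (corr_local, N) -> reliable set on an N x N grid
    """
    if k == 1:
        return set(rel_czp(corr, N))
    # Split corr into the distributions Corr(p), expressed in local coordinates.
    corr_of = {}
    for (x, y) in corr:
        corr_of.setdefault((x // N, y // N), set()).add((x % N, y % N))
    # Rel(p) for every macro-node containing at least one correct node.
    local_rel = {p: set(rel_czp(c, N)) for p, c in corr_of.items()}
    # Corr': the nodes p whose macro-node G(p) is correct.
    corr_prime = {p for p, r in local_rel.items() if is_correct(r, N)}
    result = set()
    for (i, j) in rel(k - 1, corr_prime, N, rel_czp):
        # Translate Rel(p) back to the coordinates of the larger grid.
        result |= {(N * i + a, N * j + b) for (a, b) in local_rel.get((i, j), set())}
    return result
```

To exercise the recursion without the CZP analysis, one can pass a stand-in such as
`lambda corr, N: set(corr)`, which treats every correct node as reliable. This stand-in is not
the real function and only serves to test the structure of the induction.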

Theorem 1 $\forall k \geq 1$, if $Corr$ is a distribution of correct nodes on $G_k$, then $Rel_k(Corr)$ is a
reliable node set for our BP.

Proof: The main idea of the proof is to show an equivalence between the execution on $G_{k+1}$
and a virtual execution on $G_k$ (this, of course, does not mean that $G_k$ must actually exist
for $G_{k+1}$ to work).

The proof is by induction. The property is true at rank 1 by definition. Now, let us
suppose that the property is true at rank $k$, and show that it is true at rank $k + 1$. Let $Corr$
be a distribution of correct nodes on $G_{k+1}$, and let $s$ and $r$ be two nodes of $Rel_{k+1}(Corr)$.
Let us suppose that $s$ broadcasts $m$ in $G_{k+1}$. Then, to show that $Rel_{k+1}(Corr)$ is a reliable
node set, we show that the items (1) and (2) of Definition 2 are satisfied.

1. We call accumulative a distributed algorithm where each node holds a given number
of dynamic sets $S_1, S_2, S_3, \dots$, can only add elements to these sets ($S_i \leftarrow S_i \cup \{x\}$),
and eventually executes an action when a given collection of elements has joined these
sets: $(X_1 \subseteq S_1) \wedge (X_2 \subseteq S_2) \wedge \dots$. The CZP is accumulative, and so is our BP, as it
is an inductive combination of accumulative algorithms. In other words, the order of
reception of messages is unimportant in our BP.

Let $p$ and $q$ be the nodes of $G_k$ such that $s$ belongs to $G(p)$ and $r$ belongs to $G(q)$.
By definition of $Rel_{k+1}$, $p$ and $q$ belong to $Rel_k(Corr')$. Let us suppose that $Corr'$ is
a distribution of correct nodes on $G_k$. Then, $Rel_k(Corr')$ is a reliable node set on $G_k$.
Therefore, if $p$ broadcasts $(s, m)$, there exists a sequence of message receptions such
that $q$ eventually accepts $(s, m)$. Let $(R_1, R_2, \dots, R_M)$ be this sequence, $R_i$ being a
triplet $(q_i, m_i, p_i)$ such that $q_i$ receives $m_i$ from $p_i$, with $p_1 = p$ and $q_M = q$. Let us
prove the following property $\mathcal{P}_i$ by induction, $\forall i \in \{1, \dots, M\}$: all the nodes of $Rel(q_i)$
eventually add $m_i$ to $Rec_{p_i}$.

• First, let us show that $\mathcal{P}_1$ is true. According to our BP, $s$ initially broadcasts
$(s, m)$ in $G(p)$. Therefore, as $p = p_1$, all the nodes of $Rel(p_1)$ eventually accept
$(s, m)$. Then, as they execute the same algorithm as $p_1$, they add $m_1$ to their
set $Sen_{q_1}$.

Let $Border(q_1)$ be the set of nodes of $G(q_1)$ having a neighbor in $G(p_1)$. As $G(q_1)$
and $G(p_1)$ are two correct macro-nodes, according to Definition 3, strictly more
than $N/2$ nodes of $Rel(p_1)$ have a neighbor in $Rel(q_1)$. Therefore, strictly more
than $N/2$ nodes of $Border(q_1) \cap Rel(q_1)$ eventually receive $m_1$, and broadcast it
in $G(q_1)$. So all the nodes of $Rel(q_1)$ eventually receive strictly more than $N/2$
messages $(v_x, m_1)$ with $v_x \in Border(q_1)$ and add $m_1$ to $Rec_{p_1}$. Thus, $\mathcal{P}_1$ is true.

• Now, let us suppose that $\mathcal{P}_j$ is true $\forall j \leq i$. Then, as the order of reception of
messages is unimportant, all the nodes of $Rel(p_{i+1})$ eventually behave as $p_{i+1}$, and
add $m_{i+1}$ to $Sen_{q_{i+1}}$.

Thus, by a perfectly similar demonstration, $\mathcal{P}_{i+1}$ is true.

Then, as $r \in Rel(q)$, according to $\mathcal{P}_M$, $r$ eventually receives the same messages as
$q = q_M$ and accepts $(s, m)$. Thus, the item (1) of Definition 2 is satisfied. This is
illustrated in Figure 4.

Figure 4: Illustration of the proof (1): what occurs in $Rel_k(Corr')$ eventually occurs in
$Rel_{k+1}(Corr)$

2. The proof is by contradiction. Let us suppose the opposite: $r$ accepts a message $(s, m)$,
yet $s$ did not broadcast $m$. Let $p_0$ be the node of $G_k$ such that $r \in Rel(p_0)$. If we
also have $s \in Rel(p_0)$, it is impossible that $r$ accepts $(s, m)$, as $Rel(p_0)$ is a reliable
node set. So $s$ necessarily belongs to another macro-node. Similarly to the above, let
us suppose that $Corr'$ is a distribution of correct nodes on $G_k$. Then, as $Rel_k(Corr')$
is a reliable node set on $G_k$, $r$ necessarily received a message that $p_0$ cannot receive in
$G_k$. Let us show that this is impossible.

Let $u$ be the first node of $Rel_{k+1}(Corr)$ (possibly $r$), belonging to a macro-node $G(q)$,
to receive a message $m'$ that $q$ cannot receive in $G_k$. Let $G(p)$ be the macro-node
sending this message. If $G(p)$ is not correct (in the sense of Definition 3), then $p$ does
not belong to $Corr'$, is assumed to be Byzantine on $G_k$, and can actually send $m'$ to
$q$ - so $G(p)$ is necessarily correct. It implies that $u$ received strictly more than $N/2$
messages $(v_i, m')$ with $v_i \in Border(q)$. As $G(p)$ and $G(q)$ are two correct macro-nodes,
strictly more than $N/2$ nodes of $Rel(p)$ have a neighbor in $Rel(q)$. So at least one
of the nodes $v_i$ belongs to $Rel(q)$ and received $m'$ from a neighbor $v \in Rel(p)$. As
$Rel(p)$ is a reliable node set, the only possibility is that $v$ received a message that $p$
cannot receive in $G_k$. So $u$ is not the first node in this situation, which contradicts the
initial statement. Thus, the item (2) of Definition 2 is satisfied. This is illustrated in
Figure 5.

□

Figure 5: Illustration of the proof (2): a node of $Rel_{k+1}(Corr)$ cannot misbehave

We now have a methodology to determine a reliable node set for a given distribution of
Byzantine nodes on $G_k$, $\forall k \geq 1$. In the next section, we use this methodology to prove the
claims.

4 Proof of the claims

In this section, we finally prove the claims of the paper: the number of Byzantine failures
that can be tolerated increases with the number of nodes (if they adopt the worst-case
placement), and a constant rate of Byzantine failures can be tolerated, however large the
grid may be. As in [16], our requirement to tolerate Byzantine failures is that a constant
fraction of the network communicates reliably.

4.1 Worst-case placement 

Let us give a minimal number of Byzantine failures that can be tolerated when they adopt 
an arbitrary placement (possibly the worst). 

Theorem 2 $\forall k \geq 1$, on a grid $G_k$ with at most $2^{k-1}$ Byzantine failures (arbitrarily placed),
the fraction of the network achieving reliable communication is at least $1 - \frac{4}{N^2}$.

Proof: The proof is by induction. For $k = 1$, we can test all possible placements of a single
Byzantine failure (as $N = 10$) and show that the property is true. Now, let us suppose that
the property is true at rank $k$. Let there be $2^k$ Byzantine failures arbitrarily placed on $G_{k+1}$.
Then, at most $2^{k-1}$ macro-nodes of $G_{k+1}$ contain 2 or more Byzantine failures. Again,
by testing all possible cases, we can show that an $N \times N$ grid with at most 1 Byzantine
failure is always correct in the sense of Definition 3. So at most $2^{k-1}$ macro-nodes are not
correct. Therefore, as the property is true at rank $k$, the reliable node set covers at least
a $1 - \frac{4}{N^2}$ fraction of macro-nodes (and in this worst case, all these macro-nodes have only
correct nodes). Thus, according to the definition of $Rel_{k+1}$, the property is true at rank
$k + 1$. This is illustrated in Figure 6.

□

So we can always tolerate $2^{k-1}$ failures on $G_k$. As the parameter $k$ sets the size of the
grid, this number increases with the number of nodes. To our knowledge, this is the first time
that this number is not limited by the connectivity or the maximal degree of the network.
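
To make the growth concrete, the number of tolerated failures can be tabulated against the
size of the grid. The helper names below are ours; the values follow directly from the
statement that $2^{k-1}$ failures are tolerated on the $N^k \times N^k$ grid.

```python
def tolerated_failures(k):
    """Number of arbitrarily placed Byzantine failures tolerated on G_k."""
    return 2 ** (k - 1)


def node_count(N, k):
    """Number of nodes of G_k, an N^k x N^k grid."""
    return (N ** k) ** 2


def summary(N, k_max):
    """Rows (k, number of nodes, tolerated failures) for k = 1 .. k_max."""
    return [(k, node_count(N, k), tolerated_failures(k)) for k in range(1, k_max + 1)]


for row in summary(10, 4):
    print(row)
```

The tolerated number doubles at each rank while the degree of every node stays at most
four. By contrast, the connectivity-based approaches recalled in the introduction cope with
a single Byzantine node on a grid, whatever its size.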

4.2 Random distribution 

Let us assume a constant rate of Byzantine failures (each node has the same probability $\lambda$
to be Byzantine) and give the expected reliable fraction of the network. Let $\mu = 1 - \lambda$ be
the probability that a node is correct.

Theorem 3 $\forall k \geq 1$, let $F_k(\mu)$ be the expected reliable fraction of $G_k$. Then, if $\mu \geq 1 - 10^{-5}$,
we have $F_k(\mu) \geq 1 - 10^{-4}$.

Proof: Let there be an $N \times N$ grid where each node has the same probability $\mu_0$ to be
correct. We call $P(\mu_0)$ the probability that the two following events occur:

1. The grid is correct in the sense of Definition 3.

2. A node, chosen uniformly at random, belongs to $Rel_{CZP}(Corr_0)$, $Corr_0$ being the
distribution of correct nodes on the grid.

We want to prove the following property by induction: $F_k(\mu) \geq \prod_{i=1}^{k} P^i(\mu)$, $P^i$ being the
$i$-th application of the function $P$. The property is true at rank 1, as $F_1(\mu) \geq P(\mu)$.

Now, let us suppose that the property is true at rank $k$. Let $Corr$ be the distribution
of correct nodes on $G_{k+1}$. Let $u$ be a randomly chosen node of $G_{k+1}$, and let $p$ be the
node of $G_k$ such that $u$ belongs to the macro-node $G(p)$. According to Theorem 1, to have
$u \in Rel_{k+1}(Corr)$, it is necessary and sufficient that (1) $u \in Rel(p)$ and (2) $p \in Rel_k(Corr')$.
The first event occurs with probability $P_1 \geq P(\mu)$, and if so, the second event occurs with
probability $P_2 \geq F_k(P(\mu))$. Thus, $F_{k+1}(\mu) \geq P(\mu) F_k(P(\mu)) \geq \prod_{i=1}^{k+1} P^i(\mu)$: the property is
true at rank $k + 1$. This is illustrated in Figure 7.

Now, let us give a lower bound of $P(\mu_0)$. We consider two disjoint cases:

1. The case where all the nodes of the $N \times N$ grid are correct, which occurs with
probability $\mu_0^{N^2}$. In this case, $Rel_{CZP}(Corr_0)$ covers the whole grid, and the grid is correct
in the sense of Definition 3.

2. The case where one single node is Byzantine, which occurs with probability
$N^2 (1 - \mu_0) \mu_0^{N^2 - 1}$. As $N = 10$, we evaluate $Rel_{CZP}(Corr_0)$ for the 100 possible placements
of the single Byzantine node. In 64 cases, this set contains 99 nodes. In 32 cases,
it contains 98 nodes. In 4 cases, it contains 96 nodes. Thus, the probability that a
randomly chosen correct node belongs to this set is
$\frac{64 \times 99 + 32 \times 98 + 4 \times 96}{100 \times 99} > \alpha = \frac{199}{200}$. In all cases, the grid is correct
in the sense of Definition 3. This is illustrated in Figure 8.

Figure 7: Sufficient condition for $u$ to be in $Rel_{k+1}(Corr)$

Figure 8: Different cases for the placement of 1 Byzantine node on an $N \times N$ grid
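
The arithmetic of the second case can be checked exactly with rational numbers. The sketch
below only uses the counts given in the text (64, 32 and 4 placements leaving 99, 98 and 96
reliable nodes); the variable names are ours.

```python
from fractions import Fraction

# (number of placements, size of the reliable set) for a single Byzantine node, N = 10.
cases = [(64, 99), (32, 98), (4, 96)]

placements = sum(count for count, _ in cases)           # 100 possible placements
reliable_total = sum(count * size for count, size in cases)

# Probability that a randomly chosen correct node (99 per placement) is reliable.
ratio = Fraction(reliable_total, placements * 99)
alpha = Fraction(199, 200)

# Smallest reliable fraction of the whole grid over all placements.
worst_fraction = Fraction(min(size for _, size in cases), 100)

print(ratio, float(ratio), ratio > alpha, worst_fraction)
```

The ratio simplifies to 224/225, which is indeed strictly greater than 199/200, and the
smallest reliable fraction of the grid over the 100 placements is 96/100.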



So $P(\mu) \geq g(\mu) = \mu^{N^2} + \alpha N^2 (1 - \mu) \mu^{N^2 - 1}$. This function is concave ($g'' < 0$) for
$\mu \geq \alpha$. Let $\beta = 1 - 10^{-5} > \alpha$. Then, $\forall \mu \geq \beta$, $g(\mu) \geq f(\gamma, \mu) = 1 - \gamma(1 - \mu)$, with
$\gamma = \frac{1 - g(\beta)}{1 - \beta}$. Then, we easily show by induction that $\forall k \geq 1$, $P^k(\mu) \geq f(\gamma^k, \mu)$. So
$F_k(\mu) \geq H_k(\mu) = \prod_{i=1}^{k} f(\gamma^i, \mu)$.

We now have a lower bound of $F_k(\mu)$, but it may be hard to calculate when $k$ approaches
infinity. To overcome this difficulty, let i$ be the first integer such that, Vz > io, 7* < 

r 

So H k (/i) > J^J /(7\a0 ] [ (1 ^")- Then, when k approaches infinity, we can apply 

1=1 1=20 + 1 

the Wallis formula: lim H k (ji) > TT fh\ /x) sm ^ 1 ~ ^ > 1 _ i " 4 if > /?. Thus, the 

1=1 v 

result, as $H_k(\mu)$ decreases with $k$. □

Therefore, we can tolerate a constant rate of Byzantine failures and yet have a constant
expected fraction of reliable nodes, however large the grid may be. This may have important
security applications - for instance in a computational grid where each processor has a
given probability to misbehave. This result shows that, for a given security requirement, we
can increase the size of the grid indefinitely, which could be a solution to the problem of
scalability.
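
The lower bound $H_k(\mu)$ used in the proof can also be evaluated numerically, as a sanity
check rather than as a proof. The sketch below rests on two assumptions that we state
explicitly: $\gamma$ is taken to be the slope of the chord of $g$ between $\beta$ and 1, and $\alpha = 199/200$
as in the case analysis. The function names are ours.

```python
from fractions import Fraction

N = 10
ALPHA = Fraction(199, 200)        # alpha, from the single-failure case analysis
BETA = 1 - Fraction(1, 10**5)     # beta = 1 - 10^-5


def g(mu, n=N * N, alpha=ALPHA):
    """g(mu) = mu^(N^2) + alpha * N^2 * (1 - mu) * mu^(N^2 - 1)."""
    return mu**n + alpha * n * (1 - mu) * mu**(n - 1)


def chord_slope(beta=BETA):
    """Assumed value of gamma: slope of the chord of g between beta and 1."""
    return (1 - g(beta)) / (1 - beta)


def H(k, mu, gamma):
    """H_k(mu): product for i = 1..k of f(gamma^i, mu), with f(c, mu) = 1 - c * (1 - mu)."""
    product = 1.0
    for i in range(1, k + 1):
        product *= 1.0 - (gamma ** i) * (1.0 - mu)
    return product


GAMMA = float(chord_slope())      # computed exactly, then converted to a float
print(GAMMA, H(200, float(BETA), GAMMA))
```

Under these assumptions $\gamma$ is slightly below 0.55, so the factors $f(\gamma^i, \mu)$ approach 1
geometrically, and the product evaluated at $\mu = \beta$ stays above $1 - 10^{-4}$ for the values of $k$
we tried.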



5 Conclusion 

In this paper, we have shown that Byzantine resilience is possible in a scalable
degree-bounded network. If the adversary can place the Byzantine nodes arbitrarily, then for the
first time, we can tolerate a number of Byzantine failures that largely exceeds the node 
degree. If not (random distribution), then we can tolerate a constant fraction of Byzantine 
nodes, even if the size of the network approaches infinity. 

We have the strong conviction that this approach (slice the network into clusters, then 
slice each cluster into smaller clusters, etc . . . ) can be generalized to less regular topologies. 
Indeed, the notion of a correct macro-node (see Definition 3) can be generalized to an
arbitrary graph - the key idea is that, for each interface with another macro-node, we
must still have a 3/4 fraction of reliable nodes. Besides, the network diameter can only have
discrete values here, but we could generalize the result to any network diameter.